Skip to content

fix(rootfs/qemu-static): use direct armhf exec probe instead of broken arch-test gate#9820

Open
iav wants to merge 1 commit into
armbian:mainfrom
iav:fix/native-armhf-detection
Open

fix(rootfs/qemu-static): use direct armhf exec probe instead of broken arch-test gate#9820
iav wants to merge 1 commit into
armbian:mainfrom
iav:fix/native-armhf-detection

Conversation

@iav
Copy link
Copy Markdown
Contributor

@iav iav commented May 14, 2026

Problem

prepare_host_binfmt_qemu_cross_arm64_host_armhf_target (added in 6755e9190, 2024-12-28, for Apple Silicon) gates on arch-test arm. On success → skip qemu-arm import; on failure → write a hand-rolled /usr/share/binfmts/qemu-arm magic with interpreter /usr/bin/qemu-arm-static and enable it.

This is broken on most modern aarch64 hosts:

  • arch-test arm probes ARMv5 EABI. Current toolchains and the kernel's COMPAT layer target armhf (EABI v5+); the two surfaces diverge.
  • On Ubuntu Noble 6.8 / Ampere Altra (Hetzner CAX), arch-test arm returns failure even with CONFIG_COMPAT=y and a fully-working COMPAT path. The gate then unconditionally routes every aarch64 host through the Apple-Silicon import branch — qemu-arm gets registered, mmdebstrap and chroot apt-get / dpkg --configure / customize_image run through qemu-user-static emulation, ~10× slower than native.
  • The unconditional update-binfmts --enable qemu-arm at function entry could re-enable a pre-existing registration even on hosts the operator deliberately wanted to keep without qemu.
  • Hand-rolling the descriptor overwrites a packaged /usr/share/binfmts/qemu-arm (resolute's qemu-user-binfmt ships /usr/bin/qemu-arm without -static) → enable fails → armhf builds break on resolute Apple-Silicon-like hosts.

Empirically: helios4 BUILD_MINIMAL=yes RELEASE=trixie PREFER_DOCKER=yes finishes in 1:07 min on Hetzner CAX21 when qemu-arm stays unregistered (mmdebstrap native via binfmt_elf), vs ~10× longer when emulation routes mmdebstrap.

Fix

Replace the single arch-test arm gate with a 4-tier detection that picks the safest path on each host:

  1. /usr/share/binfmts/qemu-arm exists (packaged by qemu-user-binfmt on resolute, or by qemu-user-static install): update-binfmts --enable qemu-arm — loads the packaged descriptor into the kernel. Preserves the packaged interpreter path. Hard-error if enable fails (descriptor present but broken — host needs operator attention).
  2. /proc/sys/fs/binfmt_misc/qemu-arm exists without descriptor: trust if enabled, hard-error if disabled (no descriptor to re-enable from).
  3. No registration: probe with a direct exec /usr/arm-linux-gnueabihf/lib/ld-linux-armhf.so.3 --help (ld-linux-armhf is shipped by gcc-arm-linux-gnueabihf, an armbian aarch64 host build dep). Authoritative test of CONFIG_COMPAT functionality. If succeeds → kernel runs armhf natively → no qemu setup needed.
  4. Native probe fails (Apple Silicon / no COMPAT): hand-roll /usr/share/binfmts/qemu-arm magic with /usr/bin/qemu-arm-static, import, enable — the original behaviour, now only reached when actually needed.

Also removed the unconditional update-binfmts --enable qemu-arm "|| true" at function entry — its job is subsumed by step (1).

Empirical validation

  • Native aarch64 (CAX21 / Ampere Altra, Ubuntu Noble 6.8): probe succeeds, no qemu-arm registered, helios4 trixie minimal builds 0:57 (PREFER_DOCKER=no) / 1:07 (PREFER_DOCKER=yes), mmdebstrap native.
  • No-CONFIG_COMPAT kernel (custom helios64 build with opts_n+=("COMPAT" "COMPAT_VDSO" "ARM64_32BIT_EL0")): ld-linux-armhf.so.3 execs with Exec format error; probe correctly fails → step (4) fires.
  • Pre-existing qemu-arm registration (manually added /usr/share/binfmts/qemu-arm with /usr/bin/qemu-arm-static): step (1) early-returns, descriptor md5 unchanged before/after, kernel entry left enabled.

Test plan

  • Hetzner CAX21 Noble 6.8 — native aarch64 with COMPAT, mmdebstrap native, helios4 image OK.
  • helios64 no-COMPAT kernel — ld-linux-armhf exec → "Exec format error", step (4) reachable.
  • Manual qemu-arm registration → step (1) preserves descriptor + enabled state, no clobber.
  • Resolute aarch64 with qemu-user-binfmt package — covered by inspection (step 1 enables packaged descriptor). No resolute Apple Silicon hardware available; happy to defer to community verification.

Summary by CodeRabbit

  • Bug Fixes

    • More reliable ARM cross-execution setup with early exit when already enabled.
    • Clearer, actionable error messages and retry guidance when emulation cannot be enabled.
    • Improved detection of native vs. emulated execution to avoid unnecessary emulation.
  • Refactor

    • Deterministic multi-path enablement flow that prefers native support or packaged descriptors.
    • Runtime correctness probe now debug-only and non-failing; success cache hit logged unconditionally.

Review Change Stack

@iav iav requested a review from a team as a code owner May 14, 2026 00:19
@iav iav requested review from ColorfulRhino and schwar3kat and removed request for a team May 14, 2026 00:19
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 14, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 04b3ff30-64f5-4e8f-9ea7-a53e6fb616d2

📥 Commits

Reviewing files that changed from the base of the PR and between 1aed440 and 1d6d472.

📒 Files selected for processing (1)
  • lib/functions/rootfs/qemu-static.sh

📝 Walkthrough

Walkthrough

Probe for native armhf execution; if absent, prefer a packaged /usr/share/binfmts/qemu-arm or re-enable the kernel /proc entry, otherwise write/import/enable a qemu-arm descriptor; post-setup arch-test is debug-only.

Changes

QEMU ARMhf native capability probe and setup

Layer / File(s) Summary
Trust existing enabled binfmt_misc entry
lib/functions/rootfs/qemu-static.sh
Return early when /proc/sys/fs/binfmt_misc/qemu-arm exists and its state is enabled.
Probe armhf dynamic loader for native support
lib/functions/rootfs/qemu-static.sh
Execute ld-linux-armhf.so.3 --help as an execution probe; if it succeeds, return without configuring qemu-arm; fall back to arch-test arm only when probe binary is missing and target differs.
Enable packaged qemu-arm descriptor
lib/functions/rootfs/qemu-static.sh
If /usr/share/binfmts/qemu-arm exists, run update-binfmts --import/--enable qemu-arm, sync the kernel entry, and abort with reinstall/remove-and-retry instructions if it does not become enabled.
Re-enable disabled kernel entry or abort
lib/functions/rootfs/qemu-static.sh
If kernel qemu-arm exists but is disabled and no packaged descriptor is present, attempt to write 1 into the /proc entry to enable it; abort with reinstall-and-retry instructions on failure.
Write, import and enable qemu-arm descriptor
lib/functions/rootfs/qemu-static.sh
On probe failure and no re-enable option, write /usr/share/binfmts/qemu-arm (interpreter /usr/bin/qemu-arm-static with ELF magic/mask/flags), then update-binfmts --import and --enable it.
Debug-only arch-test and final log
lib/functions/rootfs/qemu-static.sh
Under SHOW_DEBUG=yes run arch-test (discarding failures) and emit a final success message stating arm emulation was set up via qemu-arm.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

I prod ld-linux to see if it will run,
If not, I coax binfmts until the job's done,
I import descriptors, nudge /proc to wake,
Then hop off content with each cross-build I make,
A rabbit cheers when emulation hums.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: replacing a broken arch-test gate with a direct armhf exec probe in the qemu-static setup function.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions Bot added size/medium PR with more then 50 and less then 250 lines 05 Milestone: Second quarter release Needs review Seeking for review Framework Framework components labels May 14, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
lib/functions/rootfs/qemu-static.sh (1)

232-233: 💤 Low value

Hardcode qemu-arm instead of inheriting wanted_arch from caller scope.

This function's name pins the target arch (armhf_target), and the only call site (line 187) enters when wanted_arch == "arm". Relying on the caller's loop variable here couples the helper to a particular invocation context without any guard — a future caller or refactor could silently produce qemu- with no arch suffix. Using the literal string also matches the heredoc above (line 222) which is already arm-specific.

♻️ Suggested change
-	run_host_command_logged update-binfmts --import "qemu-${wanted_arch}"
-	run_host_command_logged update-binfmts --enable "qemu-${wanted_arch}"
+	run_host_command_logged update-binfmts --import "qemu-arm"
+	run_host_command_logged update-binfmts --enable "qemu-arm"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/functions/rootfs/qemu-static.sh` around lines 232 - 233, The helper
function armhf_target currently calls run_host_command_logged with
"qemu-${wanted_arch}", inheriting wanted_arch from caller scope; change these
two calls to use the literal "qemu-arm" instead of interpolating wanted_arch so
the helper is self-contained and matches the arm-specific heredoc and intent.
Update both run_host_command_logged update-binfmts --import and --enable
invocations to use "qemu-arm" and ensure no other references inside armhf_target
rely on wanted_arch.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@lib/functions/rootfs/qemu-static.sh`:
- Around line 232-233: The helper function armhf_target currently calls
run_host_command_logged with "qemu-${wanted_arch}", inheriting wanted_arch from
caller scope; change these two calls to use the literal "qemu-arm" instead of
interpolating wanted_arch so the helper is self-contained and matches the
arm-specific heredoc and intent. Update both run_host_command_logged
update-binfmts --import and --enable invocations to use "qemu-arm" and ensure no
other references inside armhf_target rely on wanted_arch.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0f4f347d-a3de-47de-91fe-fa6e149fa4d8

📥 Commits

Reviewing files that changed from the base of the PR and between 761d04e and 4a8209d.

📒 Files selected for processing (1)
  • lib/functions/rootfs/qemu-static.sh

@iav iav force-pushed the fix/native-armhf-detection branch 2 times, most recently from 934a2f0 to d5b76d4 Compare May 14, 2026 02:15
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
lib/functions/rootfs/qemu-static.sh (1)

218-226: 💤 Low value

Probe path is gated solely on the cross-toolchain loader, missing fallback for foreign architecture setups.

The probe at line 222 checks /usr/arm-linux-gnueabihf/lib/ld-linux-armhf.so.3 (installed by gcc-arm-linux-gnueabihf). If this cross-toolchain package is absent—possible in pre-prepared hosts or manual foreign architecture setups—the probe fails silently and falls through to write a hand-rolled qemu-arm descriptor and force emulation, defeating this PR's optimization.

In normal builds, gcc-arm-linux-gnueabihf is installed during host preparation when the target includes armhf, so this is not a typical correctness issue. However, hosts with armhf added as a foreign dpkg architecture (without the cross-toolchain) could use the alternative loader path /lib/arm-linux-gnueabihf/ld-linux-armhf.so.3 to detect native capability without additional packages. Consider probing both paths to support this setup, or at minimum document the gcc-arm-linux-gnueabihf dependency at the call site so users debugging slow builds can identify whether they need to install it.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/functions/rootfs/qemu-static.sh` around lines 218 - 226, The armhf kernel
probe relies only on /usr/arm-linux-gnueabihf/lib/ld-linux-armhf.so.3 via the
armhf_probe check and will miss hosts that have armhf as a foreign dpkg
architecture where the loader lives at
/lib/arm-linux-gnueabihf/ld-linux-armhf.so.3; update the probe logic used around
the armhf_probe variable to check both paths
("/usr/arm-linux-gnueabihf/lib/ld-linux-armhf.so.3" and
"/lib/arm-linux-gnueabihf/ld-linux-armhf.so.3") and invoke the same executable
--help test for whichever exists, falling back to the existing qemu-arm
descriptor logic only if neither path yields an executable; alternatively add a
clear comment/log entry (via display_alert) documenting the
gcc-arm-linux-gnueabihf dependency if you prefer not to add the second-path
probe.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@lib/functions/rootfs/qemu-static.sh`:
- Around line 218-226: The armhf kernel probe relies only on
/usr/arm-linux-gnueabihf/lib/ld-linux-armhf.so.3 via the armhf_probe check and
will miss hosts that have armhf as a foreign dpkg architecture where the loader
lives at /lib/arm-linux-gnueabihf/ld-linux-armhf.so.3; update the probe logic
used around the armhf_probe variable to check both paths
("/usr/arm-linux-gnueabihf/lib/ld-linux-armhf.so.3" and
"/lib/arm-linux-gnueabihf/ld-linux-armhf.so.3") and invoke the same executable
--help test for whichever exists, falling back to the existing qemu-arm
descriptor logic only if neither path yields an executable; alternatively add a
clear comment/log entry (via display_alert) documenting the
gcc-arm-linux-gnueabihf dependency if you prefer not to add the second-path
probe.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 688ab1df-44ba-4174-9a39-f917a6a0ebf8

📥 Commits

Reviewing files that changed from the base of the PR and between 934a2f0 and d5b76d4.

📒 Files selected for processing (1)
  • lib/functions/rootfs/qemu-static.sh

@iav iav force-pushed the fix/native-armhf-detection branch from d5b76d4 to 4be6288 Compare May 14, 2026 02:49
@iav
Copy link
Copy Markdown
Contributor Author

iav commented May 14, 2026

Re: nitpick "Probe path is gated solely on the cross-toolchain loader, missing fallback for foreign architecture setups."

Verified: gcc-arm-linux-gnueabihf is added to host_dependencies in lib/functions/host/prepare-host.sh:258-260 for any build with wanted_arch ∈ {armhf, all}, and install_host_dependencies runs (line 77 of prepare_host_noninteractive) before prepare_host_binfmt_qemu (line 98). The probe binary /usr/arm-linux-gnueabihf/lib/ld-linux-armhf.so.3 is therefore guaranteed to be in place when this helper fires in the normal flow.

The "absent" case the nitpick raises applies only when the operator sets PRE_PREPARED_HOST=yes (or equivalent), which is an explicit opt-out from armbian's host-dep management — the admin is on the hook for installing the toolchain that matches their target arch. If we wanted to be helpful in that path, a clearer message would be a separate concern (a check around assert_prepared_host rather than in the binfmt helper). Leaving this PR scoped to the original gate-replacement.

@iav
Copy link
Copy Markdown
Contributor Author

iav commented May 14, 2026

@rpardini

@igorpecovnik
Copy link
Copy Markdown
Member

Should we merge this after 05 release?

@igorpecovnik igorpecovnik added 08 Milestone: Third quarter release and removed 05 Milestone: Second quarter release labels May 16, 2026
@iav
Copy link
Copy Markdown
Contributor Author

iav commented May 16, 2026

I think it could be done earlier.
But of course I don't see the whole situation.

@igorpecovnik
Copy link
Copy Markdown
Member

Depends how well is this tested? I want as less surprises as possible when hitting the "build all" button.

@iav
Copy link
Copy Markdown
Contributor Author

iav commented May 16, 2026

Indeed — my testing surface doesn't cover CI or the big builder fleet. Agree on holding the merge until builder hiccups can't threaten a release cut.

@igorpecovnik
Copy link
Copy Markdown
Member

Agree on holding the merge until builder hiccups can't threaten a release cut.

Yep. After release we can afford to have broken CI for days while before its really not needed :)

@iav iav force-pushed the fix/native-armhf-detection branch from 4be6288 to 53cb297 Compare May 20, 2026 00:32
@github-actions github-actions Bot added the 05 Milestone: Second quarter release label May 20, 2026
@iav iav force-pushed the fix/native-armhf-detection branch 4 times, most recently from 765d575 to 1aed440 Compare May 21, 2026 22:20
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
lib/functions/rootfs/qemu-static.sh (1)

261-274: 💤 Low value

Consider adding verification after hand-write path.

Unlike the packaged descriptor path (lines 239-244), there's no verification that qemu-arm actually became enabled in the kernel after import/enable. While set -e catches command failures, update-binfmts --enable could succeed without the kernel entry actually becoming enabled (e.g., if binfmt_misc is in an unexpected state).

♻️ Optional: Add verification for consistency
 	run_host_command_logged update-binfmts --import qemu-arm
 	run_host_command_logged update-binfmts --enable qemu-arm
+	if [[ ! -e /proc/sys/fs/binfmt_misc/qemu-arm ]] ||
+		[[ "$(head -n1 /proc/sys/fs/binfmt_misc/qemu-arm 2> /dev/null)" != "enabled" ]]; then
+		exit_with_error "Hand-written qemu-arm descriptor could not be enabled. Check binfmt_misc state."
+	fi
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/functions/rootfs/qemu-static.sh` around lines 261 - 274, The hand-written
qemu-arm descriptor path writes /usr/share/binfmts/qemu-arm and runs
update-binfmts --import/--enable but lacks post-enable verification; after
run_host_command_logged update-binfmts --enable qemu-arm add a check using
update-binfmts --display qemu-arm or by reading
/proc/sys/fs/binfmt_misc/qemu-arm (or /proc/sys/fs/binfmt_misc/status) to
confirm the entry is present and marked enabled, and if the check fails retry or
log an error and exit non-zero so the failure is not silently ignored (reference
the created file /usr/share/binfmts/qemu-arm and the commands update-binfmts
--import qemu-arm and update-binfmts --enable qemu-arm).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/functions/rootfs/qemu-static.sh`:
- Around line 276-279: The debug block is passing shell operators as literal
args to arch-test (run_host_command_logged arch-test "||" true); update the
if-body so the shell operator applies to the command exit instead of being
passed to arch-test—call run_host_command_logged arch-test and place || true
after that invocation (i.e., run_host_command_logged arch-test || true), keeping
the existing SHOW_DEBUG check and display_alert call unchanged.

---

Nitpick comments:
In `@lib/functions/rootfs/qemu-static.sh`:
- Around line 261-274: The hand-written qemu-arm descriptor path writes
/usr/share/binfmts/qemu-arm and runs update-binfmts --import/--enable but lacks
post-enable verification; after run_host_command_logged update-binfmts --enable
qemu-arm add a check using update-binfmts --display qemu-arm or by reading
/proc/sys/fs/binfmt_misc/qemu-arm (or /proc/sys/fs/binfmt_misc/status) to
confirm the entry is present and marked enabled, and if the check fails retry or
log an error and exit non-zero so the failure is not silently ignored (reference
the created file /usr/share/binfmts/qemu-arm and the commands update-binfmts
--import qemu-arm and update-binfmts --enable qemu-arm).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 23911c9e-9472-4915-8474-2947dc33f7b5

📥 Commits

Reviewing files that changed from the base of the PR and between 4be6288 and 1aed440.

📒 Files selected for processing (1)
  • lib/functions/rootfs/qemu-static.sh

Comment thread lib/functions/rootfs/qemu-static.sh
…n arch-test gate

`arch-test arm` probes ARMv5 EABI, divergent from kernel COMPAT (armhf
EABI v5+), and returns failure on modern aarch64 hosts (Ubuntu Noble
6.8 / Ampere Altra) with a fully working CONFIG_COMPAT path — routing
mmdebstrap through qemu-user emulation at ~10× slowdown. Replace with
a 4-tier detection: prefer the packaged `/usr/share/binfmts/qemu-arm`
descriptor if present, trust an already-enabled kernel registration,
otherwise probe COMPAT directly by exec'ing `ld-linux-armhf.so.3`,
falling back to the hand-rolled descriptor only when none applies.

Assisted-by: Claude:claude-opus-4.7
@iav iav closed this May 22, 2026
@iav iav deleted the fix/native-armhf-detection branch May 22, 2026 21:55
@iav iav restored the fix/native-armhf-detection branch May 22, 2026 21:56
@iav iav reopened this May 22, 2026
@iav iav force-pushed the fix/native-armhf-detection branch from 1aed440 to 1d6d472 Compare May 22, 2026 21:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

05 Milestone: Second quarter release 08 Milestone: Third quarter release Framework Framework components Needs review Seeking for review size/medium PR with more then 50 and less then 250 lines

Development

Successfully merging this pull request may close these issues.

2 participants